Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepare for not defaulting to master branch for git deps #8522

Merged
merged 6 commits into from
Jul 27, 2020

Conversation

alexcrichton
Copy link
Member

@alexcrichton alexcrichton commented Jul 20, 2020

This PR is the "equivalent" of #8503 for the nightly channel of Rust (the master branch of Cargo). The purpose of this change is to fix the breakage from #8364 but to still pave a path forward to enabling the feature added in #8364, namely reinterpreting dependency directives without a branch directive as depending on HEAD insead of depending on master.

This is a series of commits which implements a few mitigation strategies, but they're all adding up into a larger plan to roll out this change with, ideally, no breaking changes for those "reasonably" keeping up with Rust. The changes here are:

  • Cargo still internally differentiates between DefaultBranch and Branch("master"), so it knows what you wrote in the manifest. These two, however, hash and equate together so everything in Cargo considers them equivalent.
  • Cargo will now issue a warning whenever it fetches a repository pinned to DefaultBranch and the branch's master branch doesn't actually match HEAD. This avenue of breakage didn't arise but I added it here for completionism about not having to deal with breakage from this change again.
  • Cargo has now implemented a new future lockfile format. Internally this is dubbed "v3". The changes in this format is that it will have a version marker at the top of the file as well as encoding git dependencies different. Today DefaultBranch and Branch("master") encode the same way, but in the future they will encode differently.
  • Cargo now has a warning where if you have a mixture of DefaultBranch and Branch("master") in your build. The intention here is to get everyone on one or the other so the transition to the new lock file format can go smoothly.

With all of these pieces in place the intention is that there is no breakage from #8364 as well. Additionally projects that would break will receive warnings. (note that projects "broken" today, those we've got issues for, will likely not get warnings, but that's because it was accidental we broke them). The long-term plan is to let this bake for quite some time, and then as part of the same change land an update to the new lock file format as well as "fixing" SourceId to consider these two directives different. When this change happens we should have guaranteed, for all currently active projects, they're either only using branch = "master" or no directive at all. We can therefore unambiguosly read the previous Cargo.lock to transition it into a new Cargo.lock which will either have ?branch=master or nothing.

I'm not sure how long this will need to bake for because unfortunately version changes to Cargo.lock cannot be taken lightly. We haven't even bumped the default format for old lock files to V2 yet, they're still on V1. Hopefully in getting this in early-ish, though, we can at least get the ball rolling.

Closes #8468

@rust-highfive
Copy link

r? @Eh2406

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 20, 2020
@alexcrichton
Copy link
Member Author

r? @ehuss

@rust-highfive rust-highfive assigned ehuss and unassigned Eh2406 Jul 20, 2020
@alexcrichton alexcrichton force-pushed the revert-master branch 2 times, most recently from a1213b2 to c18e51f Compare July 20, 2020 22:06
@@ -578,6 +596,10 @@ impl<'a> ser::Serialize for Resolve {
root: None,
metadata,
patch,
version: match self.version() {
ResolveVersion::V3 => Some(1),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it rather start at 3? Would be less confusing IMO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only for someone reading Cargo's source code, but for everyone else seeing changes to their Cargo.lock for the first time it's probably much less surprising to start at 1 instead of 3.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The v2 format has changes to Cargo.lock as well, in fact much larger ones. People wouldn't be seeing it for the first time. But ultimately this question is bikeshedding so I won't continue discussing it. Your choice. shrug.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are bikeshedding it, I'd go for 3. I'm imagining a docs or blog post explaining the history of the lockfile format, for example the one explaining the corner cases where this brakes things.

"So now we get to the third change to the format. It is called version=1...
back in the first version (the one without a version specified not the one called "1")...
in the new third format (the one called "1" not the original.)"

We are making a complicated toppik harder to talk about. On the other hand if we start with 3, that blog post will have to explain what happened to 1 and 2. That feels like a strateforered one sentence explanation.

", which predates the version field."

Anyway defintly a bikeshed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's absolutely a bikeshed, but I think on balance it'll make Cargo a little more impenetrable for new contributors if the version numbers don't match. I think I'd rather see an explanation once for why we're starting with 3 (and what happened to 2), rather than having a mismatch for all time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting back into this sub-discussion for this one comment only. The format has changed prior to the "lockfiles 2.0" change. Some examples:

Not guaranteeing for completeness of this list. I'm not sure what the "perfect" version number should be. But if #7070 says that it introduced lockfiles v2, even if that was technically wrong, it's better to continue with 3 than jumping over version numbers or starting with 1 again.

I've also wondered about semver lockfile versions where the major version indicates compatibility breakage (like the one which this PR introduces) and the minor version indicates small changes that don't break older cargo's but cargo may not generate the lockfile 100% the same way, e.g. #6180 would be one such change. If cargo supports writing lockfiles with the major version but the minor version is unsupported, it should do a more thorough check than a strcmp and only update if needed (to prevent needless back and forth, and only have back and forth if there is a different reason to update the lockfile). Basically I propose getting control flow into the body of this if check if the minor version is unsupported but the major is. The reason why I haven't proposed this earlier was because I doubt that there could be minor version increases like the generated comment addition in the future. Maybe my imagination is limited, idk. The version can be made semver in the future as well, it'd require breaking support for old cargo, aka a major number increase, and mess up the nice error messages for old cargo (turn "this version isn't supported" into "expected number, got string"). It's really a bikeshed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just gone ahead and flagged this as version = 3

Copy link
Member

@est31 est31 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alexcrichton for taking the time of writing this great PR! It's really great work! It looks like it's resolving the concerns from #8468 while still preparing for a future with support for non-master default branches.

From what I could gather, currently it doesn't support the new lockfile format (yet) in the sense that it fetches HEAD, right? If a cargo of this PR encounters a lockfile that a newer cargo built with a commit not on the master branch but in the main branch it'll still recognize that though, and only "break" on a cargo update, am I correct with this?

if head != master {
config.shell().warn(&format!(
"\
fetching `master` branch from `{}` but the `HEAD` \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This warning will fire in the case when someone has switched to the main branch since and the master branch is outdated, right? In that case, the suggestion might be wrong, maybe what the user wants is branch = "main". Cargo can't tell the new default branch but hinting at the possibility in the warning would be great (technically, cargo could list all branches and filter the ones which match with HEAD but that would require one would have to fetch basically all branches which is probably not something you'd want to do :p).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the warning will fire in that case, but Cargo is not using HEAD and is instead using the dependency as-if branch = "master" was specified. The suggestion here is to write down in Cargo.toml how to perform the same build in the future. It's true though that users may prefer to switch to branch = "something-else" if their repository has both a master and default other branch.

In general the warnings in this PR aren't great. I've attempted to at least get the bare minimum warnings emitted but honestly it's pretty difficult to get much fancier than that. This behavior of unifying dependencies and fetching is very deep within Cargo and is (at least from what I was doing) difficult to connect to "go change this line in the manifest". Much of this is that Cargo isn't preserving that kind of information to assist in debugging since this has come up rarely as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've thought about "or alternatively place branch = \"<main>\" if you want to depend on the non-master main branch of a repository." or something like that, but it's very wordy and the warning is probably already a lot of text to sift through.

@alexcrichton
Copy link
Member Author

The resolver change to either unify or not unify DefaultBranch and Branch("master") is disconnected from the lockfile versioning, yes. At this time I can't really think of an easy way to fix that. It does mean though, yes, that Cargo after this PR reading a newer lockfile will still effectively forbid branch = "master" and no branch at the same time. My hope is that we can get to a steady state where projects either have branch = "master" or not. At that point we can then start to consider them differently and write a new lock file format at the same time. Then projects, on nightly, who use this new feature will be using the new lock file format and won't be using beta/stable anyway because they're using a new Cargo feature.

@est31
Copy link
Member

est31 commented Jul 21, 2020

My hope is that we can get to a steady state where projects either have branch = "master" or not

Oh right there is the issue of projects having such hybrid uses as well, didn't think about it. How does that relate to version 3 lockfile formats?

At that point we can then start to consider them differently and write a new lock file format at the same time.

What do you mean by that? Version 4 lockfile format? If cargo as of this pr unifies branch = "master" and DefaultBranch and a future version doesn't, it will cause a difference in the version 3 lockfile, no? And if you add version 4 lockfile formats, then why need version 3 at all? As in: why add support to version 3 lockfiles at this point and not later on?

Copy link
Contributor

@ehuss ehuss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting in all this work!

//! longer compatible on stable and nightly with each other. The underlying
//! issue is that Cargo was serializing `branch = "master"` *differently* on
//! nightly than it was on stable. Historical implementations of Cargo
//! desugared `branch = "master"` to having not dependency directives in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble reading this sentence. Can the wording bet tweaked somehow?

IIUC, it desugared branch = "master" to not having a branch in the source definition? I'm not sure what "dependency directives" means in this context.

@alexcrichton
Copy link
Member Author

My plan with Cargo is to eventually land a change which differentiates DefaultBranch and Branch("master") in the resolver. That will just be a flat-out breaking change in Cargo. At this time at least there's really no feasible way to realistically conditionalize this resolver behavior based on the lock file format. This means that in the future Cargo will actually start to allow a resolution graph with both of these resolve nodes. If that is encountered then Cargo will always use the v3 lock file format. Using this feature (depending on those two git deps) will simply be a feature enabled in new Cargo not supported by old Cargo. As with all Cargo features if you use a feature in a new Cargo you can't use older Cargo to support that.

I don't think there's any need for a v4 format for this change, I plan to stop here.

@ehuss updated some words!

This commit is intended to be an effective but not literal revert
of rust-lang#8364. Internally Cargo will still distinguish between
`DefaultBranch` and `Branch("master")` when reading `Cargo.toml` files,
but for almost all purposes the two are equivalent. This will namely fix
the issue we have with lock file encodings where both are encoded with
no `branch` (and without a branch it's parsed from a lock file as
`DefaultBranch`).

This will preserve the change that `cargo vendor` will not print out
`branch = "master"` annotations but that desugars to match the lock file
on the other end, so it should continue to work.

Tests have been added in this commit for the regressions found on rust-lang#8468.
This commit implements a warning in Cargo for when a dependency
directive is using `DefaultBranch` but the default branch of the remote
repository is not actually `master`. We will eventually break this
dependency directive and the warning indicates the fix, which is to
write down `branch = "master"`.
This commit lays the groundwork for an eventual V3 of the lock file
format. The changes in this format are:

* A `version` indicator will be at the top of the file so we don't have
  to guess what format the lock is in, we know for sure. Additionally
  Cargo now reading a super-from-the-future lock file format will give a
  better error.

* Git dependencies with `Branch("master")` will be encoded with
  `?branch=master` instead of with nothing.

The motivation for this change is to eventually switch Cargo's
interpretation of default git branches.
This commit implements a simple warning to indicate when `DefaultBranch`
is unified with `Branch("master")`, meaning `Cargo.toml` inconsistently
lists `branch = "master"` and not. The intention here is to ensure that
all projects uniformly use one or the other to ensure we can smoothly
transition to the new lock file format.
@ehuss
Copy link
Contributor

ehuss commented Jul 27, 2020

I've gone over this several times, and I can't think of any issues that haven't already been highlighted. Thanks again Alex!
@bors r+

@bors
Copy link
Contributor

bors commented Jul 27, 2020

📌 Commit 7f65cae has been approved by ehuss

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 27, 2020
@bors
Copy link
Contributor

bors commented Jul 27, 2020

⌛ Testing commit 7f65cae with merge b49ccad...

@bors
Copy link
Contributor

bors commented Jul 27, 2020

☀️ Test successful - checks-actions
Approved by: ehuss
Pushing b49ccad to master...

@bors bors merged commit b49ccad into rust-lang:master Jul 27, 2020
@bkchr
Copy link

bkchr commented Jul 28, 2020

Thank you both!

bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 30, 2020
Update cargo

14 commits in aa6872140ab0fa10f641ab0b981d5330d419e927..974eb438da8ced6e3becda2bbf63d9b643eacdeb
2020-07-23 13:46:27 +0000 to 2020-07-29 16:15:05 +0000
- Fix O0 build scripts by default without `[profile.release]` (rust-lang/cargo#8560)
- Emphasize git dependency version locking behavior. (rust-lang/cargo#8561)
- Update lock file encodings on changes (rust-lang/cargo#8554)
- Fix sporadic lto test failures. (rust-lang/cargo#8559)
- build-std: Fix libraries paths following upstream (rust-lang/cargo#8558)
- Flag git http errors as maybe spurious (rust-lang/cargo#8553)
- Display builtin aliases with `cargo --list` (rust-lang/cargo#8542)
- Check manifest for requiring nonexistent features (rust-lang/cargo#7950)
- Clarify test name filter usage (rust-lang/cargo#8552)
- Revert Cargo Book changes for default edition (rust-lang/cargo#8551)
- Prepare for not defaulting to master branch for git deps (rust-lang/cargo#8522)
- Include `+` for crates.io feature requirements in the Cargo Book section on features (rust-lang/cargo#8547)
- Update termcolor and fwdansi versions (rust-lang/cargo#8540)
- Cargo book nitpick in Manifest section (rust-lang/cargo#8543)
@ehuss ehuss mentioned this pull request Jul 31, 2020
bors added a commit that referenced this pull request Nov 2, 2020
Normalize SourceID in `cargo metadata`.

The SourceID in `cargo metadata` can have different values, but they can be equivalent in Cargo. This results in different serialized forms, which prevents comparing the ID strings. In this particular case, `SourceKind::Git(GitReference::Branch("master"))` is equivalent to `SourceKind::Git(GitReference::DefaultBranch)`, but they serialize differently.

The reason these end up differently is because the `SourceId` for a `Package` is created from the `Dependency` declaration. But the `SourceId` in `Cargo.lock` comes from the deserialized file. If you have an explicit `branch = "master"` in the dependency, then versions prior to 1.47 would *not* include `?branch=master` in `Cargo.lock`. However, since 1.47, internally Cargo will use `GitReference::Branch("master")`.

Conversely, if you have a new `Cargo.lock` (with `?branch=master`), and then *remove* the explicit `branch="master"` from `Cargo.toml`, you'll end up with another mismatch in `cargo metadata`.

The solution here is to use the variant from the `Package` when serializing the resolver in `cargo metadata`. I chose this since the `Package` variant is displayed on other JSON messages (like artifact messages), and I think this is the only place that the resolver variants are exposed (other than `Cargo.lock` itself).

I'm not convinced that this was entirely intended, since there is [code to avoid this](https://github.com/rust-lang/cargo/blob/6a38927551df9bbe2dea340bf92d3e53abccf890/src/cargo/core/resolver/encode.rs#L688-L695), and at the time #8522 landed, I did not realize this would change the V2 lock format. However, it's probably too late to try to reverse that, and I don't think there are any other problems other than this `cargo metadata` inconsistency.

Fixes #8756
alexcrichton added a commit to alexcrichton/cargo that referenced this pull request Feb 4, 2021
This commit follows through with work started in rust-lang#8522 to change the
default behavior of `git` dependencies where if not branch/tag/etc is
listed then `HEAD` is used instead of the `master` branch. This involves
also changing the default lock file format, now including a `version`
marker at the top of the file notably as well as changing the encoding
of `branch=master` directives in `Cargo.toml`.

If we did all our work correctly then this will be a seamless change.
First released on stable in 1.47.0 (2020-10-08) Cargo has been emitting
warnings about situations which may break in the future. This means that
if you don't specify `branch = 'master'` but your HEAD branch isn't
`master`, you've been getting a warning. Similarly if your dependency
graph used both `branch = 'master'` as well as specifying nothing, you
were receiving warnings as well. These two situations are broken by this
commit, but it's hoped that by giving enough times with warnings we
don't actually break anyone in practice.
@alexcrichton alexcrichton deleted the revert-master branch February 4, 2021 18:46
bors added a commit that referenced this pull request Feb 9, 2021
Change git dependencies to use `HEAD` by default

This commit follows through with work started in #8522 to change the
default behavior of `git` dependencies where if not branch/tag/etc is
listed then `HEAD` is used instead of the `master` branch. This involves
also changing the default lock file format, now including a `version`
marker at the top of the file notably as well as changing the encoding
of `branch=master` directives in `Cargo.toml`.

If we did all our work correctly then this will be a seamless change.
First released on stable in 1.47.0 (2020-10-08) Cargo has been emitting
warnings about situations which may break in the future. This means that
if you don't specify `branch = 'master'` but your HEAD branch isn't
`master`, you've been getting a warning. Similarly if your dependency
graph used both `branch = 'master'` as well as specifying nothing, you
were receiving warnings as well. These two situations are broken by this
commit, but it's hoped that by giving enough times with warnings we
don't actually break anyone in practice.
@ehuss ehuss added this to the 1.47.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cargo build updates the git commit
8 participants